Document Clustering Using the 1 + 1 Dimensional Self-Organising Map
Abstract
Automatic clustering of documents has become increasingly important with the explosion of online information. The Self-Organising Map (SOM) has been used to cluster documents effectively, but efforts to date have used either a single 2-dimensional map or a series of them. Ideally, the output of a document-clustering algorithm should be easy for a user to interpret. This paper describes a method of clustering documents using a series of 1-dimensional SOMs arranged hierarchically to provide an intuitive tree structure representing document clusters. WordNet is used to find the base forms of words, and clustering is performed only on words that can be nouns.
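
The idea of a hierarchy of 1-dimensional SOMs can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not give the training schedule, the growing criteria, or the term weighting, so the decay rules, the parameters (`n_nodes`, `min_size`, `max_depth`), and all function names below are assumptions chosen for illustration. Documents are assumed to be already converted to equal-length numeric vectors (for example, counts of noun base forms).

```python
import math
import random


def best_matching_unit(weights, x):
    """Index of the node whose weights are closest (squared Euclidean) to x."""
    return min(
        range(len(weights)),
        key=lambda j: sum((wk - xk) ** 2 for wk, xk in zip(weights[j], x)),
    )


def train_1d_som(vectors, n_nodes=2, epochs=50, lr0=0.5, seed=0):
    """Train a 1-dimensional SOM (a chain of nodes) on equal-length vectors."""
    rng = random.Random(seed)
    dim = len(vectors[0])
    # Assumed initialisation: uniform random weights in [0, 1).
    weights = [[rng.random() for _ in range(dim)] for _ in range(n_nodes)]
    radius0 = max(n_nodes / 2.0, 1.0)
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                    # linearly decaying learning rate
        radius = max(radius0 * (1.0 - frac), 0.5)  # shrinking neighbourhood width
        order = list(vectors)
        rng.shuffle(order)
        for x in order:
            bmu = best_matching_unit(weights, x)
            for j, w in enumerate(weights):
                # Gaussian neighbourhood measured along the 1-D chain of nodes.
                h = math.exp(-((j - bmu) ** 2) / (2.0 * radius ** 2))
                for k in range(dim):
                    w[k] += lr * h * (x[k] - w[k])
    return weights


def build_tree(vectors, ids, n_nodes=2, min_size=3, depth=0, max_depth=3):
    """Recursively split documents with 1-D SOMs; returns a nested dict tree."""
    if len(ids) < min_size or depth >= max_depth:
        return {"docs": ids, "children": []}
    weights = train_1d_som([vectors[i] for i in ids], n_nodes=n_nodes)
    groups = [[] for _ in range(n_nodes)]
    for i in ids:
        groups[best_matching_unit(weights, vectors[i])].append(i)
    non_empty = [g for g in groups if g]
    if len(non_empty) < 2:  # the map did not separate the documents; stop here
        return {"docs": ids, "children": []}
    children = [
        build_tree(vectors, g, n_nodes, min_size, depth + 1, max_depth)
        for g in non_empty
    ]
    return {"docs": ids, "children": children}


if __name__ == "__main__":
    # Toy 2-term "documents"; real input would be high-dimensional term vectors.
    docs = [[1.0, 0.0], [0.9, 0.1], [1.0, 0.1], [0.0, 1.0], [0.1, 0.9], [0.0, 0.8]]
    print(build_tree(docs, list(range(len(docs)))))
```

Each node of the resulting tree holds the documents mapped to one SOM node, and its children come from a further 1-dimensional map trained on just those documents, which is what gives the browsable tree structure the abstract describes.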
Similar Resources
On Document Classification with Self-Organising Maps
This research deals with the use of self-organising maps for the classification of text documents. The aim was to classify documents to separate classes according to their topics. We therefore constructed self-organising maps that were effective for this task and tested them with German newspaper documents. We compared the results gained to those of k nearest neighbour searching and k-means clu...
Self-Organising Maps for Tree View Based Hierarchical Document Clustering
In this paper we investigate the use of Self-Organising Maps (SOM) for document clustering. Previous methods using the SOM to cluster documents have used two-dimensional maps. This paper presents a hierarchical and growing method using a series of one-dimensional maps instead. Using this type of SOM is an efficient method for clustering documents and browsing them in a dynamically generated tree ...
Self-Organising Maps in Document Classification: A Comparison with Six Machine Learning Methods
This paper focuses on the use of self-organising maps, also known as Kohonen maps, for the classification task of text documents. The aim is to effectively and automatically classify documents to separate classes based on their topics. The classification with self-organising map was tested with three data sets and the results were then compared to those of six well known baseline methods: k-mea...
A novel self-organising clustering model for time-event documents
Purpose: Neural document clustering techniques, e.g., self-organising map (SOM) or growing neural gas (GNG), usually assume that textual information is stationary in quantity. However, the quantity of text is ever-increasing. We propose a novel dynamic adaptive self-organising hybrid (DASH) model, which adapts to time-event news collections not only to the neural topological structure but al...
Adaptive topological tree structure for document organisation and visualisation
The self-organising map (SOM) is finding more and more applications in a wide range of fields, such as clustering, pattern recognition and visualisation. It has also been employed in knowledge management and information retrieval. We propose an alternative to existing 2-dimensional SOM based methods for document analysis. The method, termed Adaptive Topological Tree Structure (ATTS), generates ...